(57) ABSTRACT

A technique for implicitly encoding shape information by
using a chroma-key color. A bounding box is created enclosing
the video object. The bounding box is extended to the next
integer multiple of the macroblock size and divided
into a plurality of macroblocks. For each boundary
macroblock, each pixel outside the object is replaced with
the chroma-key color to implicitly encode shape information.
Pixel data for boundary macroblocks and macroblocks
inside the object are DCT transformed, scaled and motion
compensated. A finer quantizer (smaller quantizer) is used
for boundary macroblocks to improve image quality. A
first_shape_code can be used to identify each macroblock
as either 1) inside the object; 2) outside the object; or 3) on
the object boundary. To improve data compression and
achieve low complexity shape extraction with DCT and
motion compensation, a first_shape_code is sent for all
macroblocks, and only macroblocks that are inside the
object or on the object boundary are coded. The decoding
system decodes the first_shape_code and, if necessary, the
DCT and motion compensation information. The motion
compensated luminance and chrominance pixel values of a
reconstructed object at the decoding system are compared to
the chroma-key color and thresholds to reconstruct the shape
of the object, and to output texture information of the video
object.

18 Claims, 7 Drawing Sheets 
FIG. 5

SEGMENT OBJECT IN VIDEO FRAME

CREATE BOUNDING BOX FOR OBJECT

FOR EACH BOUNDARY BLOCK, REPLACE EACH PIXEL OUTSIDE THE OBJECT WITH A CHROMA-KEY COLOR K

WITHIN THE BOUNDING BOX, IDENTIFY WHICH MACROBLOCKS ARE: a) OUTSIDE THE OBJECT; b) INSIDE THE OBJECT; c) ON THE BOUNDARY

PERFORM MOTION COMPENSATION AND CALCULATE MOTION VECTORS FOR SOME OF THE MACROBLOCKS INSIDE THE OBJECT OR ON THE BOUNDARY

CODE LUMINANCE AND CHROMINANCE VALUES FOR EACH PIXEL FROM MACROBLOCKS INSIDE THE OBJECT AND ON THE BOUNDARY

OUTPUT AN ENCODED BITSTREAM FOR THE OBJECT INCLUDING TRANSFORMED AND QUANTIZED LUMINANCE AND CHROMINANCE VALUES FOR EACH CODED MACROBLOCK, MOTION VECTORS, ADDITIONAL INFORMATION, CHROMA-KEY AND CODES
FIG. 6

DECODE VARIABLE LENGTH CODES REPRESENTING DCT COEFFICIENTS, MOTION VECTORS, AND MACROBLOCK LOCATION

INVERSE DCT TRANSFORM THE DCT COEFFICIENTS AND GENERATE MOTION COMPENSATED LUM. AND CHROM. VALUES BASED ON THE MOTION VECTORS

DECODE CHROMA-KEY AND THRESHOLDS

RECOVER THE RECONSTRUCTED SHAPE OF THE OBJECT AND THE SHAPE OF THE OBJECT
CHROMA-KEY FOR EFFICIENT AND LOW
COMPLEXITY SHAPE REPRESENTATION 
OF CODED ARBITRARY VIDEO OBJECTS 

CROSS-REFERENCE TO RELATED 
APPLICATIONS 

This application claims priority from U.S. Provisional 
Application Ser. No. 60/052,971, filed on Jul. 18, 1997. This
application is a continuation-in-part of co-pending application
Ser. No. 08/801,716, filed on Feb. 14, 1997, entitled
"Method and Apparatus for Coding Segmented Regions 
Which May Be Transparent In Video Sequences For 
Content-Based Scalability," incorporated by reference 
herein. 

BACKGROUND OF THE INVENTION 

The present invention relates to the field of digital video
coding technology and, more particularly, to a method and
apparatus for providing an improved chroma-key shape
representation of video objects of arbitrary shape.

A variety of protocols for communication, storage and 
retrieval of video images are known. Invariably, the proto- 
cols are developed with a particular emphasis on reducing 
signal bandwidth. With a reduction of signal bandwidth, 
storage devices are able to store more images and commu- 
nications systems can send more images at a given commu- 
nication rate. Reduction in signal bandwidth increases the 
overall capacity of the system using the signal. 

However, bandwidth reduction may be associated with 
particular disadvantages. For instance, certain known coding 
systems are lossy because they introduce errors which may 
affect the perceptual quality of the decoded image. Others
may achieve significant bandwidth reduction for certain 
types of images but may not achieve any bandwidth reduc- 
tion for others. Accordingly, the selection of coding schemes 
must be carefully considered. 

The Motion Picture Expert Group (MPEG) has success- 
fully introduced two standards for coding of audiovisual 
information, known by acronyms as MPEG-1 and MPEG-2. 
MPEG is currently working on a new standard, known as 
MPEG-4. MPEG-4 video aims at providing standardized 
core technologies allowing efficient storage, transmission 
and manipulation of video data in multimedia environments. 
A detailed proposal for MPEG-4 is set forth in MPEG-4 
Video Verification Model (VM) 5.0, hereby incorporated by 
reference. 

MPEG-4 considers a scene to be a composition of video 
objects. In most applications, each video object represents a 
semantically meaningful object. Each uncompressed video
object is represented as a set of Y, U, and V components 
(luminance and chrominance values) plus information about 
its shape, stored frame after frame in predefined temporal 
intervals. Each video object is separately coded and trans- 
mitted with other objects. As described in MPEG-4, a video 
object plane (VOP) is an occurrence of a video object at a 
given time. For a video object, two different VOPs represent 
snap shots of the same video object at two different times. 
For simplicity we have often used the term video object to 
refer to its VOP at a specific instant in time. 

As an example, FIG. 1(A) illustrates a frame for coding 
that includes a head and shoulders of a narrator, a logo 
suspended within the frame and a background. FIGS. 1(B)-1(D)
illustrate the frame of FIG. 1(A) broken into three
VOPs. By convention, a background generally is assigned
VOP0. The narrator and logo may be assigned VOP1 and
VOP2 respectively. Within each VOP, all image data is
coded and decoded identically.

The VOP encoder for MPEG-4 separately codes shape 
information and texture (luminance and chrominance) infor- 
mation for the video object. The shape information is 
encoded as an alpha map that indicates whether or not each 
pixel is part of the video object. The texture information is 
coded as luminance and chrominance values. Thus, the VOP 
encoder for MPEG-4 employs explicit shape coding because 
the shape information is coded separately from the texture 
information (luminance and chrominance values for each 
pixel). While an explicit shape coding technique can provide 
excellent results at high bit rates, explicit shape coding 
requires additional bandwidth for carrying shape informa- 
tion separate from texture information. Moreover, results are 
unimpressive for the explicit shape coding at low coding bit 
rates because significant bandwidth is occupied by explicit
shape information, resulting in low quality texture recon- 
struction for the object. 

As an alternative to explicitly coding shape information, 
implicit shape coding techniques have been proposed in 
which shape information is not explicitly coded. Rather, in 
implicit shape coding, the shape of each object can be 
ascertained based on the texture information. Implicit shape
coding techniques provide a simpler design (less complex
than the explicit technique) and reasonable performance,
particularly at lower bit rates. Implicit shape coding reduces
signal bandwidth because shape information is not explicitly 
transmitted. As a result, implicit shape coding can be par- 
ticularly important for low bit rate applications, such as 
mobile and other wireless applications. 

However, implicit shape coding generally does not per- 
form as well as explicit shape coding, particularly for more 
demanding scenes. For example, objects often contain color 
bleeding artifacts on object edges when using implicit shape 
coding. Also, it can be difficult to obtain lossless shapes 
using the implicit techniques because shape coding quality 
is determined by texture coding quality and is not provided 
explicitly. Therefore, a need exists for an improved implicit 
shape coding technique.

SUMMARY OF THE INVENTION

The system of the present invention can include an 
encoding system and a decoding system that overcomes the 
disadvantages and drawbacks of prior systems. 

An encoding system uses chroma-key shape coding to
implicitly encode shape information with texture information.
The encoding system includes a bounding box generator
and color replacer, a DCT encoder, a quantizer, a motion
estimator/compensator and a variable length coder. A video
object to be encoded is enclosed by a bounding box and only
macroblocks in the bounding box are processed to improve
data compression. Each macroblock inside the bounding box
is identified as either 1) outside the object; 2) inside the
object; or 3) on the object boundary. Macroblocks outside
the object are not coded to further improve data compression.
For boundary macroblocks, pixels located outside the
object (background pixels) are replaced with a chroma-key
color K to implicitly encode the shape of the object. The
luminance and chrominance values for macroblocks inside
the object and on the object boundary are coded, including
transforming the luminance and chrominance values to
obtain DCT coefficients, and quantizing (scaling) the DCT
coefficients. Motion compensation can also be performed on
some macroblocks to generate motion vectors. In addition,
to improve image quality, boundary macroblocks can be
quantized at a finer level than other macroblocks in the
bounding box. A bitstream is output from the encoding
system. The bitstream can include the encoded macroblock
pixel data, a code identifying the position (e.g., inside,
outside or on the boundary) of each coded macroblock, the
chroma-key value and thresholds, motion vectors and one or
more quantizers. Where a finer quantization is applied to
boundary macroblocks, the bitstream also includes a code
indicating the exact quantizer used for boundary macroblocks
and a code indicating the number of quantization
levels for macroblocks inside the object.
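
By way of illustration only, the scaling of DCT coefficients with a standard quantizer and a finer boundary quantizer might be sketched as follows. The function names are hypothetical, the multipliers are the ones listed in Table 3 of this document, and truncation toward zero is assumed from the statement that small coefficients are "divided down and truncated to zero"; this is a sketch under those assumptions, not the claimed implementation.

```python
from fractions import Fraction

# Multipliers applied to VOP_quant for boundary macroblocks, keyed by the
# two-bit bound_quant code (values as listed in Table 3 of this document).
BOUND_QUANT_SCALE = {
    "00": Fraction(1, 2),
    "01": Fraction(5, 8),
    "10": Fraction(7, 8),
    "11": Fraction(1, 1),
}


def boundary_quantizer(vop_quant, bound_quant_code):
    """Quantizer for boundary macroblocks, expressed relative to VOP_quant."""
    return vop_quant * BOUND_QUANT_SCALE[bound_quant_code]


def quantize(coeffs, quantizer):
    """Divide each DCT coefficient by the quantizer, truncating toward zero."""
    return [int(c / quantizer) for c in coeffs]


def dequantize(levels, quantizer):
    """Rescale quantized levels by multiplying by the same quantizer."""
    return [level * quantizer for level in levels]


if __name__ == "__main__":
    coeffs = [100, -37, 6, 3]          # hypothetical DCT coefficients
    vop_quant = 8                      # hypothetical VOP_quant value
    bq = boundary_quantizer(vop_quant, "00")
    print(quantize(coeffs, vop_quant))  # coarser: more coefficients become 0
    print(quantize(coeffs, bq))         # finer: more detail survives
```

With the smaller quantizer, fewer coefficients are truncated to zero, which is the sense in which boundary macroblocks receive a larger share of the available bits.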

A decoding system includes a variable length decoder, an
inverse quantizer, a motion compensator, an inverse DCT
coder, and a color extractor and shape mask detector. A
bitstream is received and decoded by the decoding system to
obtain both texture information (e.g., luminance and chrominance
data) and shape information for a video object. The
shape information is implicitly encoded. DCT coefficients
and motion vectors for each macroblock are inverse quantized
(rescaled) based on the codes (quantizers) identifying
the specified quantizer or the specified number of quantization
levels for each. The reconstructed video object is
obtained by passing only the pixel values for the object (e.g.,
by rejecting pixel values within a predetermined range of the
chroma-key). The shape of the video object is obtained by
generating a binary map or shape mask (e.g., 1s or 0s)
identifying each pixel as either inside the object or outside
the object. A gray-scale map (shape mask) can be generated
instead by using two thresholds to soften the object boundaries.
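
As a non-authoritative sketch of the shape recovery just described, a decoder could compare each reconstructed pixel to the chroma-key and derive either a binary or a gray-scale mask. The distance measure (sum of absolute Y, U, V differences) and the linear ramp between the two thresholds are assumptions made for illustration; the text above only states that a range around the chroma-key and two thresholds are used. All names are hypothetical.

```python
def key_distance(pixel, key):
    """Assumed metric: sum of absolute Y, U, V differences from the key color."""
    return sum(abs(p - k) for p, k in zip(pixel, key))


def binary_mask(pixels, key, threshold):
    """Return 1 for object pixels and 0 for pixels within `threshold` of the key."""
    return [[0 if key_distance(p, key) <= threshold else 1 for p in row]
            for row in pixels]


def gray_mask(pixels, key, t1, t2):
    """Return 0 at or below t1, 255 at or above t2, and an assumed linear
    ramp in between, which softens the object boundary."""
    def alpha(p):
        d = key_distance(p, key)
        if d <= t1:
            return 0
        if d >= t2:
            return 255
        return (d - t1) * 255 // (t2 - t1)
    return [[alpha(p) for p in row] for row in pixels]


if __name__ == "__main__":
    key = (50, 200, 60)  # hypothetical chroma-key color K as (Y, U, V)
    row = [(50, 200, 60), (52, 198, 61), (70, 190, 70), (200, 120, 130)]
    print(binary_mask([row], key, threshold=10))
    print(gray_mask([row], key, t1=10, t2=70))
```

Pixels whose mask value is nonzero would be passed through as texture of the reconstructed object; the rest are treated as background.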

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1(A) illustrates an example frame for coding. 

FIGS. 1(B)-1(D) illustrate the frame of FIG. 1(A) broken
into three Video Object Planes. 

FIG. 2A is a block diagram illustrating an encoding
system according to an embodiment of the present inven- 
tion. 

FIG. 2B is a block diagram of a decoding system accord- 
ing to an embodiment of the present invention. 

FIG. 3 illustrates an example of a bounding box bounding 
a video object according to an embodiment of the present 
invention. 

FIG. 4 illustrates an example of a video object according 
to an embodiment of the present invention. 

FIG. 5 is a flow chart illustrating the operation of an 
encoding system according to an embodiment of the present 
invention. 

FIG. 6 is a flow chart illustrating the operation of a 
decoding system according to an embodiment of the present 
invention. 

DETAILED DESCRIPTION 

Referring to the drawings in detail, wherein like numerals 
indicate like elements, FIG. 2A is a block diagram illustrating
an encoding system according to an embodiment of the
present invention. FIG. 2B is a block diagram of a decoding 
system according to an embodiment of the present inven- 
tion. 

Encoding system 202 uses chroma-key shape coding to
implicitly encode shape information. According to the
present invention, an encoding system 202 (FIG. 2A)
receives a video picture or frame including a segmented
video object as an input signal over line 204, representative
of a VOP to be coded (e.g., includes the object and some
background). The input signal is sampled and organized into
macroblocks which are spatial areas of each frame. The
encoding system 202 codes the macroblocks and outputs an
encoded bitstream over a line 220 to a channel 219. The
channel 219 may be a radio channel, a computer network or
some storage media such as a memory or a magnetic or
optical disk. A decoding system 230 (FIG. 2B) receives the
bitstream over line 228 from the channel 219 and reconstructs
a video object therefrom for display.

Encoding system 202 includes a bounding box generator
and color replacer 206 for generating a bounding box around
the segmented video object and for replacing pixel values
located outside the object boundary with a predetermined
key color (or chroma-key color) K, according to an embodiment
of the present invention. The chroma-key color and
some threshold values are output on line 203 by bounding
box generator and color replacer 206. According to an
embodiment of the present invention, instead of enclosing
each video object in a full size picture and processing all
macroblocks in the received full size picture, the video
object can advantageously be enclosed by a bounding box
and only macroblocks in the bounding box are processed
(e.g., only pixel data is passed for macroblocks inside the
bounding box). According to an embodiment of the present
invention, the position of the bounding box is chosen such
that it contains a minimum number of 16 pixel × 16 pixel
macroblocks (while bounding the video object). As a result,
processing time is reduced. In this manner, bounding box
generator and color replacer 206 implicitly encodes information
describing the shape of the video object in the texture
(luminance and chrominance values) information for the
object. According to an embodiment of the present
invention, the bounding box generator and color replacer
206 outputs signals on line 201 including texture information
(pixel values) for the object (for pixels inside the object
boundary), and outputs the chroma-key pixel value for
pixels outside the object boundary (because these pixels
outside the object boundary were replaced with the chroma-key
color).
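
The bounding box and color replacement described above can be sketched as follows, purely for illustration. The sketch assumes the segmentation is available as a binary mask with at least one object pixel, grows the tight box toward the bottom and right to the next multiple of the macroblock size (a simplification; it does not search for the position that minimizes the macroblock count), and uses hypothetical function names.

```python
MB = 16  # macroblock size in pixels


def round_up(n, mb=MB):
    """Round n up to the next integer multiple of the macroblock size."""
    return -(-n // mb) * mb


def bounding_box(mask, mb=MB):
    """Tight box around nonzero mask pixels, grown to a multiple of `mb`.

    Returns (top, left, height, width). Growing toward the bottom/right is a
    simplification, and the box may extend past the frame edge.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1
    return top, left, round_up(height, mb), round_up(width, mb)


def replace_with_key(frame, mask, box, key):
    """Copy the box out of `frame`, writing `key` wherever the pixel is not
    part of the object (or lies beyond the frame edge)."""
    top, left, height, width = box
    out = []
    for y in range(top, top + height):
        row = []
        for x in range(left, left + width):
            inside = y < len(mask) and x < len(mask[0]) and mask[y][x]
            row.append(frame[y][x] if inside else key)
        out.append(row)
    return out
```

The returned block has dimensions that are whole multiples of 16, so it can be divided into macroblocks without padding.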

The output of generator and color replacer 206 is coupled
via line 201 to a macroblock formatter and mode decider
207. Macroblock formatter and mode decider 207 divides
the video object into macroblocks (MBs) and determines
whether each MB is inside the boundary (of the video
object), outside the boundary, or on the boundary (e.g.,
having pixels inside and pixels outside the boundary of the
object); this classification is known as the mode. The
macroblock formatter and mode decider 207 then outputs a
first_shape_code for each macroblock identifying the mode
of each macroblock.
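
A minimal sketch of the mode decision, assuming a binary object mask whose dimensions are multiples of 16, is given below. The code values follow Table 2 of this document ("0" for boundary, "10" for outside, "11" for inside); the function names are hypothetical and the sketch is illustrative rather than definitive.

```python
# Code words taken from Table 2 of this document.
FIRST_SHAPE_CODE = {"boundary": "0", "outside": "10", "inside": "11"}


def macroblock_mode(mask, top, left, mb=16):
    """Classify one macroblock from the binary object mask."""
    pixels = [mask[y][x]
              for y in range(top, top + mb)
              for x in range(left, left + mb)]
    if all(pixels):
        return "inside"      # every pixel belongs to the object
    if not any(pixels):
        return "outside"     # no pixel belongs to the object
    return "boundary"        # mixture of object and background pixels


def first_shape_codes(mask, mb=16):
    """Return a grid of first_shape_code strings, one per macroblock.

    Assumes the mask dimensions are integer multiples of `mb`.
    """
    return [[FIRST_SHAPE_CODE[macroblock_mode(mask, y, x, mb)]
             for x in range(0, len(mask[0]), mb)]
            for y in range(0, len(mask), mb)]
```

Only macroblocks whose code is not "10" would then have pixel data passed on for coding.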

In addition, according to an embodiment of the present
invention, macroblock formatter and mode decider 207 also
operates like a filter because it outputs pixel data on line 208
(to be encoded) only for macroblocks that are either inside
the boundary or on the boundary (pixel data are not output
for macroblocks outside the boundary). The first_shape_code
is generated for each macroblock and identifies those
macroblocks for which no pixel data is transmitted. Thus,
data compression and encoding speed are improved because
pixel data for macroblocks outside the boundary will not be
encoded and transmitted.

The pixel data on line 208 (including texture information
or pixel values for the pixels inside the object boundary, and
the replaced chroma-key values for those pixels outside the
object boundary and inside the bounding box) is input to a
difference circuit 215 and to a motion estimator/compensator
209. Motion estimator/compensator 209 generates
a motion predicted signal that is output on line 225.
Difference circuit 215 subtracts the motion predicted signal
(on line 225) from the pixel data (on line 208) to output pixel
difference values on line 213.

The pixel (image) difference values are input to a DCT
encoder 210 via line 213. DCT encoder 210 performs a
transformation of the image data, such as discrete cosine
transform ("DCT") coding or sub-band coding, from the
pixel values (luminance and chrominance values) to DCT
coefficients (frequency domain). A block of pixels is transformed
to an equivalently sized block of DCT coefficients.
DCT encoder 210 outputs DCT coefficients (corresponding
to the pixel data) on line 212.

A quantizer 214 is connected via line 212 to DCT encoder
210. Quantizer 214 scales or quantizes the DCT coefficients
output on line 212 by dividing each coefficient by a predetermined
quantizer. The quantizer is a constant or variable
scalar value (Qp). For example, the DCT coefficients can be
quantized by dividing each coefficient by the quantizer (Qp).
In general, the quantizer 214 reduces bandwidth of the
image signal by reducing a number of quantization levels
available for encoding the signal. The quantization process
is lossy. Many small DCT coefficients input to the quantizer
214 are divided down and truncated to zero. The scaled
signal (scaled or quantized DCT coefficients) is output from
quantizer 214 via line 216.

Usually, the same quantizer (a VOP_quant) is used to
quantize DCT coefficients for all macroblocks of a VOP.
However, according to an embodiment of the present
invention, certain macroblocks (e.g., boundary macroblocks)
can be quantized using a smaller quantizer to better define
the boundary or edge of an object. A special quantizer for
boundary macroblocks (indicated by bound_quant) is used
for boundary macroblocks. The boundary quantizer is specified
by the bound_quant signal, which is output on line 217
from quantizer 214.

An inverse quantizer and inverse DCT encoder (e.g., a
DCT decoder) receives the scaled DCT coefficients via line
216, inverse quantizes the DCT coefficients and then converts
the DCT coefficients to pixel values to generate the
pixel difference values, output on line 223.

An adder circuit 224 receives as inputs the pixel difference
signal via line 223 and the motion predicted signal 225
(from motion estimator/compensator 209). Adder circuit 224
generates an approximate value of the input signal (provided
on line 208). This approximation signal, output on line 226,
is the current frame data and is input to motion estimator/
compensator 209 to be used as a predictor for the next frame.

Motion estimator/compensator 209 performs motion
estimation/compensation to output the motion predicted
signal on line 225 and motion vectors (MV) based on the
pixel data input on line 208 and the approximation of the
pixel data input on line 226. Motion vectors (MV) for one
or more macroblocks are output via line 211.

A variable length coder 218 variable length codes the
scaled DCT coefficients (input on line 216), motion vectors
(MVs input on line 211), the chroma-key color and thresholds
(input on line 203) and the bound_quant value (input
on line 217) into a bitstream. The bitstream is output via line
220 to channel 219 for transmission.

Decoding system 230 (FIG. 2B) receives the encoded
bitstream from channel 219 via line 228. A variable length
decoder 232 variable length decodes the encoded bitstream
into scaled DCT coefficients (output on line 234) for each
macroblock, motion vectors for each macroblock (MVs
output on line 222), the first_shape_codes for each macroblock
(output on line 243), the bound_quant value (output
on line 241) and the chroma-key color and threshold values
(output on line 245).

The scaled DCT coefficients are input via line 234 and the
bound_quant value is input via line 241 to inverse quantizer
236. Inverse quantizer 236 rescales the DCT coefficients
according to the quantizer, which is a constant or variable
scalar value (Qp). For example, the DCT coefficients
can be rescaled by multiplying each coefficient by the
quantizer (the standard quantizer and the
bound_quant value can be transmitted with the bitstream).
Inverse quantizer 236 increases a number of quantization
levels available for encoding the signal (e.g., back to the
original number of quantization levels). Inverse quantizer
236 may use one quantizer for macroblocks inside the
boundary (e.g., VOP_quant), and a finer quantizer (bound_quant)
for boundary macroblocks. The same quantizers used
at the encoding system 202 are also used at the decoding
system 230.

Inverse DCT encoder 240 performs an inverse DCT
transform on the DCT coefficients received as an input via
line 237 to output pixel values (luminance and chrominance
values) for each macroblock on line 246.

For those macroblocks that were coded using motion
compensation, the motion predicted signal provided on line
244 (output from motion compensator 242) is added by
adder circuit 248 to the pixel values on line 246 to output the
reconstructed pixel values for each macroblock on line 251.
Motion compensator 242 generates the motion predicted
pixel signal on line 244 based on the reconstructed pixel
signal on line 251 and the motion vectors for each macroblock
received via line 222.

The reconstructed pixel signal is input via line 251 to a
color extractor and shape mask generator 249. The color
extractor and shape mask generator 249 also receives as
inputs the chroma-key color and thresholds (via line 245)
and the first_shape_code (via line 243). The color extractor
and shape mask generator 249 compares each pixel value (in
the reconstructed pixel signal) to the chroma-key value (or
a range of values near the chroma-key color). By comparing
the pixel values to the chroma-key value, the color extractor
and shape mask generator 249 can determine which pixels
are located within the object and which pixels are located
outside of the object and thereby identify the original shape
of the object in the VOP. The pixels located within the object
are output via line 252 as a reconstructed video object
(stripping off the chroma-key or background pixels to output
object pixel values). Also, color extractor and shape mask
generator 249 generates and outputs a shape mask identifying
the shape of the video object. The shape mask can be
generated as a binary map (e.g., a 1 or 0 for each pixel) or
a gray scale map identifying whether each pixel is either
inside or outside the video object. The shape mask is output
via line 254 and can be used, for example, by a compositer
to combine multiple video objects into a single (multi-object)
frame.

The above-described chroma-key shape coding technique
of the present invention provides a simple and efficient
method for video shape coding. Furthermore, the present
invention includes several additional features and advantages
that further improve or refine the above-described
chroma-key shape coding technique without adding unjustifiable
overhead or complexity. The present invention can
include one or more of the following features:

1. Bounding Box: Process Only Macroblocks Inside the
Bounding Box:

Instead of enclosing each video object in a full size picture
and processing all macroblocks in the picture, the video
object can advantageously be enclosed by a bounding box
and only macroblocks in the bounding box are processed.
Prior to calculating a bounding box around the object, the
object is first segmented from the video frame. Any of
several well-known segmentation techniques can be used to
segment the video object from the remainder of the video
frame. The position of the bounding box is chosen such that
it contains a minimum number of 16 pixel × 16 pixel macroblocks.
The encoding/decoding process is performed on a
macroblock basis. In this manner, processing time can be
reduced.

FIG. 3 illustrates an example of a bounding box 310 that
bounds a video object 315. The bounding box 310 is divided
into a plurality of macroblocks 320. (This is similar to the
bounding box used in the explicit shape coding of MPEG-4
Verification Model (VM).) As a result, macroblocks within
the bounding box 310 are either 1) inside the object 315
(where the macroblock is completely inside the object); 2)
outside the object 315 (where the macroblock is completely
outside the object); or 3) on the object boundary (e.g., the
macroblock has both pixel(s) inside the object and pixel(s)
outside the object).

FIG. 4 illustrates an example of a video object according
to an embodiment of the present invention. Video object 405
is bounded by a bounding box (not shown). The bounding
box is divided into a plurality of macroblocks. Some of the
macroblocks are illustrated in FIG. 4. For example, macroblocks
MB1, MB2 and MB5 are outside the video object
405. Macroblocks MB11, MB12 and MB14-16 are located
inside the video object 405. Also, macroblocks MB3, MB4,
MB6, MB7, MB9 and MB13 are on the object boundary.

2. First_shape_code: For each macroblock in the bounding
box, the present invention can use a first_shape_code
to identify whether the macroblock is:

a) outside the object;

b) inside the object; or

c) on the object boundary.

A first_shape_code is transmitted with the data for each
macroblock. (For those macroblocks outside the boundary,
only the first_shape_code will be transmitted.)

The first_shape_code can be implemented several different
ways. Two examples of a first_shape_code are
described below:

TABLE 1

first_shape_code    Macroblock Shape
0                   all_0 (outside the object)
1                   others (inside or on boundary)

In Table 1, first_shape_code is a 1 bit code that indicates
whether the macroblock is outside the video object or not. A
first_shape_code of 0 indicates that the macroblock is
outside the object. A first_shape_code of 1 indicates that
the macroblock is either inside the object or on the boundary.

TABLE 2

first_shape_code    Macroblock Shape
0                   boundary
10                  all_0 (outside the object)
11                  all_255 (inside the object)

In Table 2, the first_shape_code is transmitted as a two
bit code. The two bits can be used to identify whether the
macroblock is located on the boundary, outside the object, or
inside the object.

3. Within the Bounding Box, Apply Chroma-Keying Only to
Boundary Macroblocks: Background Macroblocks Are
Not Coded:

For macroblocks on the boundary (identified,
for example, by the first_shape_code), pixels outside the
object (e.g., background pixels) are replaced with the
chroma-key color. This chroma-key replacement of background
pixels is performed only for boundary macroblocks.
Replacing the background pixels in the boundary macroblocks
with the chroma-key implicitly codes shape information
for the object. Also, macroblocks outside the object (and
within the bounding box) are not coded.

After chroma-key pixel replacement, only blocks inside
the object or on the boundary are coded (e.g., DCT
transformed, quantized, and variable length coded for
transmission). By not coding macroblocks located outside
the video object (background macroblocks), a significant
number of overhead bits can be saved, thereby increasing
data compression.

In addition, information should be sent identifying those
macroblocks inside the bounding box and outside the object
(and thus, identifying those macroblocks that were not
coded). An additional bit can be added to the first_shape_code
to identify those macroblocks that are within the
bounding box but outside the object (identifying those
macroblocks that are not coded).

4. Bound_quant: Use a Finer Quantization for Boundary
Macroblocks:

To further improve image quality, a finer quantization can
be used for boundary macroblocks, as compared to the
quantization for the other macroblocks in the bounding box.
This can be done by quantizer 214 (FIG. 2A) scaling or
quantizing the DCT coefficients output on line 212 according
to a smaller quantizer for the boundary macroblocks.
Therefore, quantizer 214 uses a larger number of quantization
levels (a smaller quantizer) for the boundary
macroblocks, resulting in finer quantization of the boundary
macroblocks. Because bandwidth is limited, using a larger
number of quantization levels (e.g., a quantizer less than 1)
for the boundary macroblocks allocates or apportions a
larger number of the available bits (bandwidth) to boundary
macroblocks to better define the outer edge or boundary of
the video object.

According to an embodiment of the present invention, a
finer quantization for boundary blocks can be specified
through a boundary quantization code (bound_quant). In
MPEG-4 VM, a VOP quantization code (VOP_quant) is a
code that specifies the quantization for the VOP. In
MPEG-4 VM, the DCT coefficients are divided by the VOP
quantization code. According to the present invention, the
background macroblocks within the bounding box are not coded.
Therefore, according to an embodiment of the present
invention, VOP_quant specifies the number of quantization
levels for macroblocks inside the object and bound_quant
specifies the number of quantization levels for boundary
macroblocks.

According to an embodiment of the present invention, the
bound_quant code can be used to specify the level of
quantization for boundary macroblocks relative to the level
of quantization for the other macroblocks, as follows:
US 6,208,693 B1

TABLE 3

bound_quant    times VOP_quant
00             1/2
01             5/8
10             7/8
11             1

In Table 3, a bound_quant code indicates the quantization
parameter for boundary macroblocks as compared to the
quantization parameter of other macroblocks. For example,
a bound_quant of 11 indicates that the quantization parameter
for boundary macroblocks is the same as (one times) the
quantization parameter for other macroblocks in the bounding
box (the VOP_quant). This indicates that the quantization
parameter for the boundary macroblocks is the same
as for other macroblocks.

A bound_quant code of 00 similarly indicates that the
quantization parameter for the boundary macroblocks is one
half that of other macroblocks, resulting in finer quantization
of the boundary macroblocks. Other values for the
bound_quant code specify various other quantization
parameters for boundary macroblocks. Other techniques
can be used to specify an increased number of
quantization levels (finer quantization) for boundary macroblocks
(as compared to other macroblocks).

5. Choice of Chroma-Key Color:

Although the choice of key color is an encoding issue, it
has the potential of causing shape degradation due to potential
color leakage if saturated colors are used. (Saturation is

performed for all pixels in the picture or frame or performed
only for boundary macroblocks.

At step 525, within the bounding box, each macroblock
is formatted and is identified as either: 1) outside the video
object (a background macroblock); 2) inside the object; or 3)
on the object boundary. A code for each macroblock, such as
the first_shape_code, is used to identify the position of
each macroblock (inside, outside or on the object boundary).

At step 530, motion compensation is performed on at least
some of the boundary or inside macroblocks, including
calculating motion vectors. Motion vectors are calculated
only for those macroblocks coded with motion compensation.

At step 535, the luminance and chrominance (pixel)
values for boundary macroblocks and macroblocks located
inside the object are coded. According to the present
invention, macroblocks outside the object (e.g., background
macroblocks) are not coded. Thus, in the event that all pixels
(including pixels outside the bounding box) were replaced
with the chroma-key at step 520, these replaced pixels
located outside the bounding box are simply discarded (but
the first_shape_codes indicate which macroblocks have no
data transmitted for them). Coding includes DCT transforming
the luminance and chrominance values for the macroblocks
to obtain DCT coefficients, and then quantizing
(scaling) the DCT coefficients. The motion vectors and the
scaled DCT coefficients are then variable length coded. The
steps of DCT transforming, quantizing (generally), performing
motion compensation, calculating motion vectors and
variable length coding can be performed, for example, in a
manner similar to that set forth in MPEG-4 VM 5.0. Accord-
the degree of purity of a color; for example, a pure spectral ing to an embodiment of the present invention, boundary 
color having a single wavelength has a saturation of 100%, macroblocks can be quantized using finer quantization than 
while white light has a saturation of zero). On the other macroblocks inside the object. 

hand, use of a saturated color improves shape recovery 35 At step 540, a coded bit stream is output from the 
because natural scenes do not often contain such colors. encoding system to the channel. The bit stream includes the 
However, the only restriction for chroma-keying is that the transformed and quantized (scaled) luminance and chromi- 
chroraa-key color does not exist in the scene. The use of less nance data for each coded macroblock, motion vectors, 
saturated colors has been investigated, similar to the ones codes (such as the first_shape_a)de) identifying the posi- 
used in studio environments for chroma-keying of scenes. 40 tion or mode (e.g., inside, outside or on the boundary) of 
A relatively saturated color can be used, such as Y"50, each macroblock, a code (such as the VOP_quant code) 
Cb=200, Cr=100. However, weaker colors (less saturated) indicating the level of quantization for macroblocks located 
can be used to reduce the potential for shape distortion due inside the object and a code (such as the bound_quant code) 
to color bleeding. According to an embodiment of the indicating the relative level of quantization for boundary 
present invention, an example of a less saturated color that 45 macroblocks (if different), motion vectors, and the chroma- 
can be used to decrease the potential for color bleeding is key and threshold values. The bit stream can also include 
Y«135, Cb=160, Cr«110. Other less saturated colors can be additional information. For boundary macroblocks, pixels 
similarly used as the chroma-key color to decrease the located outside the object have been replaced with the 
potential for shape distortion. For notational simplicity, chroma-key color so as to implicitly code the shape of the 
instead of using Cb and Cr, the notations of U and V, 50 object within the texture information (luminance and 
respectively, will be used in the remainder of this application chrominance data) for the object. 

(although strictly speaking Cb and Cr differ from U and V To reduce overhead and improve data compression, mac- 
by a small scaling factor and an oflset). roblocks located outside the object (e.g., background 

FIG. 5 is a (low chart illusUrating the operation of an macroblocks) are not coded, and the chroma key is applied 
encoding system according to an embodiment of the present 55 to background pixels only for boundary macroblocks. In 
invention. addition, a finer quantization can be used for boundary 

At step 510 a video frame is received and a video object macroblocks to improve image quality, 
is segmented from the remainder of the video frame. One of FIG. 6 is a flow chart illustrating the operation of a 
several well known techniques can be used to segment the decoding system according to an embodiment of the present 
object. 60 invention. 

At step 515, a bounding box is created around the video At step 610, the bit stream is received from the channel, 
object (VOP). The position of the bounding box is chosen and the variable length codes are decoded to obtain the 
such that it contains a minimum number of 16 pixelxl6 pixel scaled DCT coefficients, motion vectors (MVs), codes iden- 
macroblocks. Other size macroblocks can be used. Process- tifying the location or mode of macroblocks (e.g., first_ 
ing is performed on a macroblock basis. 65 shape_code), quantizers (e.g., VOP_quant, bound_quant), 

At step 520, each background pixel (pixels outside the and chroma-key color and thresholds. Image data is not 
object) is replaced with the chroma-key color K. This can be provided for the identified background macroblocks. 
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At step 615, the data (including DCT coefficients and 
motion vectors) for each macroblock is inverse quantized 
(rescaled) based on the bound_quani code (for boundary 
raacroblocks) and the VOP_quant code (for macroblocks 
inside the object). 5 
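The code tables and the rescaling of step 615 can be sketched
as follows. This is only an illustration under stated
assumptions, not the decoder of the invention: the bit stream
is modeled as a Python string of '0' and '1' characters, the
function names are hypothetical, the boundary quantizer is
kept as an exact fraction with no rounding rule, and rescaling
is taken to be multiplication by the quantizer.

```python
from fractions import Fraction

# Table 2: variable-length first_shape_code -> macroblock mode.
FIRST_SHAPE_CODE = {"0": "boundary", "10": "outside", "11": "inside"}

# Table 3: bound_quant -> boundary quantizer as a fraction of VOP_quant.
BOUND_QUANT = {
    "00": Fraction(1, 2),
    "01": Fraction(5, 8),
    "10": Fraction(7, 8),
    "11": Fraction(1, 1),
}


def read_first_shape_code(bits, pos):
    """Read one Table 2 code starting at bits[pos].

    Returns the macroblock mode and the position of the next unread bit.
    The bit-string representation is an assumption of this sketch.
    """
    if bits[pos] == "0":
        return FIRST_SHAPE_CODE["0"], pos + 1
    return FIRST_SHAPE_CODE[bits[pos:pos + 2]], pos + 2


def quantizer_for(mode, vop_quant, bound_quant_code):
    """Pick the quantizer for a macroblock from its mode (hypothetical helper)."""
    if mode == "boundary":
        return vop_quant * BOUND_QUANT[bound_quant_code]
    if mode == "inside":
        return Fraction(vop_quant)
    raise ValueError("outside macroblocks carry no coefficient data")


def rescale(levels, quantizer):
    """Inverse quantize by multiplying each scaled coefficient by the quantizer."""
    return [level * quantizer for level in levels]
```

With VOP_quant of 16 and a bound_quant code of 00, a boundary
macroblock in this sketch is rescaled with a quantizer of 8,
half the value used for macroblocks inside the object.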

At step 620, the DCT coefficients are inverse DCT 
transformed, and motion compensation is performed based 
on the motion vectors (for those macroblocks coded with 
motion compensation) to generate motion compensated 
luminance and chrominance pixel values for macroblocks
inside the object and on the object boundary. This can be 
performed, for example, as specified by MPEG-4 VM. 

At step 622, the chroma-key and thresholds (described in 
greater detail in the example below) are decoded. 

At step 625, the reconstructed video object is recovered,
and the shape of the object is recovered. The reconstructed
video object can be recovered by passing only pixel values
that are not equal to the chroma-key color (or not within a
small range of the chroma-key color). This passes only the
object pixel data.

Object shape information can be recovered by generating
a shape mask or a segmentation map, indicating which
pixels are part of the object, and which pixels are not.
According to an embodiment of the present invention the
segmentation map can be generated as a binary segmentation
map. The binary segmentation map can be generated by
determining whether or not each pixel value is near the
chroma-key value K. If a pixel is near the chroma-key value
(e.g., within a threshold T of the chroma-key value), then the
pixel is not included in the recovered video object or frame.
If the pixel is not near the chroma-key value (e.g., the pixel
value is not within a threshold of the chroma-key value),
then the pixel is included in the recovered video object
(considered foreground). The video object has the shape
indicated by the binary segmentation map and a texture
(luminance and chrominance values) indicated by those
decoded pixel values which are not near the chroma-key
value. If the first_shape_code indicates which macroblocks
are on the object boundary, then color extraction (e.g.,
comparison of the pixel to the chroma-key to determine if
the pixel is inside or outside the boundary) need only be
performed for boundary macroblocks to obtain a binary map
identifying the shape of the object.
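The single-threshold comparison can be sketched as follows.
This is a minimal illustration under stated assumptions: the
paragraph above does not say how nearness to the key color is
measured, so the sketch assumes a sum of squared Y, U and V
differences, and the function names are hypothetical.

```python
def distance(pixel, key):
    """Sum of squared Y, U, V differences (an assumed nearness measure)."""
    return sum((k - x) ** 2 for k, x in zip(key, pixel))


def binary_mask(pixels, key, threshold):
    """Return 1 for object pixels and 0 for pixels near the key color.

    pixels is a list of rows of (Y, U, V) tuples, for example the
    decoded pixels of one boundary macroblock.
    """
    return [[0 if distance(p, key) < threshold else 1 for p in row]
            for row in pixels]
```

Because only boundary macroblocks mix object and key-colored
pixels, a decoder that knows the macroblock modes could call
such a routine for boundary macroblocks only and fill the
mask with constants elsewhere.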

One problem with the use of a single threshold T to
generate a binary segmentation map at the decoder for
chroma-keying is that the sharp boundary condition can
cause a rough or jagged edge for the object boundary.
Instead of a binary map as described above, the segmentation
map can have gray-level values to create softer boundaries.
In computer graphics or in blue-screen movies, alias-free
natural looking boundaries can be generated using two
thresholds instead of one at the boundary regions.

According to another embodiment of the present
invention, instead of using a single threshold T at the
decoding system, two thresholds T1 and T2 can be used. The
region between T1 and T2 is the boundary. A value of 0
indicates background and a value of 255 indicates foreground
(the object), assuming 8 bits of coding per pixel
(merely as an example). Note that T1 affects the amount of
background while T2 affects the amount of foreground. If T2
is too high, part of the foreground will fall below T2 and be
rendered partly transparent. If T1 is too low, part of the
background will be included in the object, and hence
introduce artifacts. On the other hand, if T1 and T2 are too
close to each other, then the object boundary becomes harder
(losing the advantages of boundary softening). The tradeoffs
among these factors can be used to select the best thresholds
for a particular application. For example, human interaction
and subjective determinations can be used at the encoding
system to select the thresholds T1 and T2. T1 can be set
equal to T2 to create the step function or sharp boundary
condition provided by the binary segmentation map.

Using two thresholds T1 and T2, the shape information
can be recovered from the reconstructed texture information
as follows:

1) Calculate an alpha value for a decoded pixel (X) by
first computing a distance by either of two methods:

Method 1: d = (K_Y - X_Y)^2 + (K_U - X_U)^2 + (K_V - X_V)^2;
default method

Method 2: d1 = |K_Y - X_Y| + |K_U - X_U| + |K_V - X_V|;
alternate method.

If method 2 is employed, d1 needs to be multiplied by a
scaling factor to fit the same range as d computed by
method 1, with respect to which thresholds T1 and T2 are
sent.

2) The alpha value (α) for each pixel is a function of the
distance d between the reconstructed YUV values of pixel X
and the key color K:

if (d < T1) then α = 0;
else if (T1 < d < T2) then α = (d - T1)/(T2 - T1) x 255;
else if (d > T2) then α = 255.

The values T1 and T2 are set by the encoder (assuming
method 1 for computing d) and sent to the decoder as side
information. According to an embodiment of the present
invention, α can denote the transparency of a pixel, where α
being 255 indicates that the object is opaque, and α being 0
indicates that the pixel is transparent. The resulting value for
a pixel that has an α somewhere between 0 and 255 is
semi-transparent and is a weighted combination of the pixel
value in the current picture and the pixel value from a
background picture that is specified externally or in advance.
This allows a smoothing or blending function to be performed
at object boundaries. Thus, the resulting pixel value
for each component (Y, U and V) can be calculated as:

{α·X + (255 - α)·Z}/255

where X is the decoded pixel component value (X_Y, X_U or
X_V), and Z is the pixel component value (Z_Y, Z_U or Z_V) for
each component of the background picture. This calculation
should be performed for each component value (Y, U, V).

The system of the present invention can include an
encoding system 202 and a decoding system 230. Encoding
system 202 uses chroma-key shape coding to implicitly
encode shape information. Encoding system 202 includes a
bounding box generator and color replacer 206, a macroblock
formatter and mode decider 207, a DCT encoder 210,
a quantizer 214, a motion estimator/compensator 209 and a
variable length coder 218. A video object to be encoded is
enclosed by a bounding box and only macroblocks in the
bounding box are processed. The position of the bounding
box is chosen such that it contains a minimum number of
macroblocks.

The encoding/decoding process is performed macroblock
by macroblock. To increase data compression, macroblocks
outside the bounding box are not coded.

A code can be used to identify each macroblock inside the
bounding box as either 1) outside the object; 2) inside the
object; or 3) on the object boundary. For boundary
macroblocks, pixels located outside the object (e.g.,
background pixels) are replaced with a chroma-key color K to
implicitly encode the shape of the object. The luminance and
chrominance values for macroblocks inside the object and
on the boundary are coded. Coding includes, for example,
transforming the luminance and chrominance values to
obtain DCT coefficients, and quantizing (scaling) the DCT
coefficients. Motion compensation can also be performed on
macroblocks to generate motion vectors. In addition,
boundary macroblocks can be quantized at a finer level to
improve image quality. A bitstream is output from encoding
system 202. The bitstream includes the transformed and
quantized (scaled) luminance and chrominance data for each
coded macroblock, motion vectors, codes (such as the
first_shape_code) identifying the position (e.g., inside,
outside or on the boundary) of each macroblock, a quantizer
code (such as the VOP_quant code) indicating the number of
quantization levels for macroblocks located inside the object
and a quantizer code (such as the bound_quant code)
indicating the number of quantization levels for boundary
macroblocks (if different).
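The bounding box, macroblock classification and color
replacement performed by the encoding system can be sketched
as follows. This is an illustration under stated assumptions,
not the implementation of the invention: the segmentation
result is assumed to be a non-empty binary mask (1 for object
pixels), the tightest box is grown only to the right and
downward to whole macroblocks (the text requires only that the
box contain a minimum number of macroblocks), pixels beyond
the frame edge count as background, and all names are
hypothetical.

```python
MB = 16  # macroblock size in pixels (16x16 luminance)


def extended_bounding_box(mask):
    """Tightest box around the object, grown to a whole number of macroblocks.

    Returns (top, left, height, width). Assumes the mask contains at
    least one object pixel.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    top, left = rows[0], cols[0]
    height = -(-(rows[-1] - top + 1) // MB) * MB  # round up to multiple of MB
    width = -(-(cols[-1] - left + 1) // MB) * MB
    return top, left, height, width


def classify_macroblock(mask, top, left):
    """Return 'outside', 'inside' or 'boundary' for one macroblock."""
    count = 0
    for y in range(top, top + MB):
        for x in range(left, left + MB):
            # Pixels past the frame edge are treated as background.
            if y < len(mask) and x < len(mask[0]) and mask[y][x]:
                count += 1
    if count == 0:
        return "outside"
    if count == MB * MB:
        return "inside"
    return "boundary"


def key_fill(frame, mask, top, left, key):
    """Replace background pixels of one macroblock with the key color, in place."""
    for y in range(top, min(top + MB, len(frame))):
        for x in range(left, min(left + MB, len(frame[0]))):
            if not mask[y][x]:
                frame[y][x] = key
```

In this sketch key_fill would be called only for macroblocks
classified as boundary, so that macroblocks inside the object
keep their original texture and macroblocks outside the
object are simply not coded.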

Decoding system 230 includes a variable length decoder
232, an inverse quantizer 236, a motion compensator 242, an
inverse DCT 240, and a color extractor and shape mask
generator 249. A bitstream is received and decoded by
decoding system 230 to obtain both texture information
(e.g., luminance and chrominance data) and shape
information for a video object. The shape information is
implicitly encoded. DCT coefficients and motion vectors for
each macroblock are inverse quantized (rescaled) based on the
bound_quant code (for boundary macroblocks) and the
VOP_quant code (for macroblocks inside the object).
Motion compensated luminance and chrominance values are
generated based on the motion vectors. A color extractor and
shape mask generator 249 reconstructs the video object by
passing only pixel values that are different from the
chroma-key color, and generates a shape mask (identifying the
shape of the object), also by comparing pixel values to the
chroma-key color. These two processes can be performed
together. The shape of the object (and thus, an identification
of the object itself) can be determined by comparing each
pixel value with the chroma-key value K. If a pixel is within
a predetermined threshold of the chroma-key value, the pixel
is not included in the recovered video object or frame
(rather, it is considered background). If the pixel is not
within a threshold of the chroma-key value, then the pixel is
included in the recovered video object (considered
foreground). The shape of the video object is thus recovered
(e.g., by generating a binary shape mask at the decoding
system based on the pixel value comparison). For example,
the binary shape mask can be generated as 1s for object data
and 0s for the other (background) pixels. The texture of the
object is recovered as the decoded luminance and chrominance
values of the object (e.g., pixel values outside the
threshold of the chroma-key value are output as texture data
of the object). Also, a gray-scale segmentation map can be
generated using two thresholds to soften the object
boundaries.
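The two-threshold gray-scale map and the blending with a
background picture can be sketched as follows. This is a
minimal illustration under stated assumptions: the distance is
the sum of squared differences (method 1), the cases d = T1
and d = T2, which the description leaves open, are assigned 0
and 255 respectively, integer division is used for rounding,
and the function names are hypothetical.

```python
def distance(pixel, key):
    """Method 1: sum of squared Y, U, V differences to the key color."""
    return sum((k - x) ** 2 for k, x in zip(key, pixel))


def alpha(d, t1, t2):
    """Map a distance to an alpha value between 0 and 255.

    Equality with t1 or t2 is resolved to 0 or 255 (an assumption),
    which also gives a step function when t1 == t2.
    """
    if d <= t1:
        return 0
    if d >= t2:
        return 255
    return (d - t1) * 255 // (t2 - t1)


def blend(pixel, background, a):
    """Weighted combination {a*X + (255 - a)*Z}/255 for each component."""
    return tuple((a * x + (255 - a) * z) // 255
                 for x, z in zip(pixel, background))
```

A pixel equal to the key color is replaced by the background
pixel, a pixel far from the key color is passed unchanged, and
a pixel whose distance lies between the thresholds becomes a
mix of the two, which is what softens the object boundary.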

What is claimed is: 

1. A method of implicitly encoding shape information for 
a video object, comprising the steps of:

receiving a video frame, including a video object; 

creating a box bounding the video object, the bounding 
box divided into a plurality of macroblocks, each 
macroblock comprising a plurality of chrominance and 
luminance pixels; 

identifying which macroblocks are inside the object or on 
the object boundary; 

for each boundary macroblock, replacing each pixel outside
the object with a key color;

for boundary macroblocks and macroblocks inside the
object, computing luminance and chrominance pixel
difference values by subtracting motion compensated
prediction signals from the corresponding luminance
and chrominance pixel values;

for boundary macroblocks and macroblocks inside the
object, transforming the luminance and chrominance
pixel difference values to frequency domain
coefficients;

scaling the coefficients for macroblocks inside the object
using a first quantizer;

scaling the coefficients for boundary macroblocks using a
second quantizer to provide a finer level of quantization
for said boundary macroblocks as compared to said
macroblocks inside the object; and

outputting a bitstream including the scaled coefficients
and information identifying the quantizers.

2. A method of implicitly encoding shape information for 
a video object comprising the steps of: 

receiving a video frame, including a video object; 

creating the tightest box bounding the video object, 
extending the box in horizontal and vertical directions 
to fit the next integer number of macroblocks in each 
direction, the extended bounding box divided into a 
plurality of macroblocks, each macroblock comprising 
a 16x16 array of luminance pixels in the form of 4, 8x8 
blocks and the corresponding chrominance pixels; 

identifying which macroblocks are inside the object or on 
the object boundary; 

for each boundary macroblock, replacing each pixel out- 
side the object with a key color; 

for boundary macroblocks and macroblocks inside the 
object, computing luminance and chrominance pixel 
difference values by subtracting motion compensated 
prediction signals from the corresponding luminance 
and chrominance pixel values; 

for boundary macroblocks and macroblocks inside the
object, transforming the luminance and chrominance 
pixel difference values to frequency domain coeffi- 
cients; 

scaling the coefficients for macroblocks inside the object 
using a first quantizer; 

scaling the coefficients for boundary macroblocks using a 
second quantizer, wherein the second quantizer is 
smaller than or equal to the first quantizer to provide a 
finer level of quantization for said boundary macrob- 
locks; and 

outputting a bitstream including the scaled coefficients
and information identifying the quantizers. 

3. The method of claim 1 wherein the key color is chosen 
to be from among the less saturated colors and the key color 
does not exist in the object. 

4. The method of claim 1 wherein said bitstream further 
comprises a first_shape_code provided for at least some of 
the macroblocks and efficiently identifying which of the 
macroblocks are inside the object and identifying which 
macroblocks are outside the object. 

5. The method of claim 1 wherein said bitstream further 
comprises a first_shape_code provided for each macroblock
and efficiently identifying which of the macroblocks are
inside the object, outside the object or on the boundary of the 
object. 

6. The method of claim 1 and further comprising the step 
of variable length coding the scaled coefficients and said 
information. 

7. The method of claim 1 wherein said bitstream comprises
coded motion vectors, transformed and scaled luminance
and chrominance pixel difference values, and codes
indicating the quantizers for boundary macroblocks and
other macroblocks inside the bounding box, and an
identification of the macroblocks outside the object.

8. The method of claim 1 wherein said step of transforming
comprises the step of discrete cosine transform (DCT)
transforming the luminance and chrominance values to DCT
coefficients for boundary macroblocks and macroblocks
inside the object.

9. The method of claim 1 wherein:

said step of scaling the coefficients for macroblocks inside
the object using a first quantizer comprises the step of
dividing the coefficients for macroblocks inside the
object by the first quantizer; and

said step of scaling the coefficients for boundary
macroblocks using a second quantizer comprises the step of
dividing the coefficients for boundary macroblocks by
the second quantizer, wherein the second quantizer is
less than or equal to the first quantizer.

10. A method of decoding a video bitstream in which the
shape of a video object has been implicitly encoded,
comprising the steps of:

receiving a bitstream representing a video object, the
bitstream including scaled frequency domain coefficients
for each of a plurality of macroblocks inside the
object or on the object boundary;

rescaling the coefficients for macroblocks inside the
object using a first quantizer;

rescaling the coefficients for macroblocks on the object
boundary using a second quantizer wherein the second
quantizer is smaller than or equal to the first quantizer;

inverse transforming the frequency domain coefficients to
obtain luminance and chrominance pixel difference
values;

adding a prediction signal generated by a motion
compensator to the luminance and chrominance pixel
difference values to obtain the luminance and chrominance
pixel values of a reconstructed video object; and

recovering the approximate shape of the object by
analyzing the luminance and chrominance values of at
least the boundary macroblocks of the reconstructed
video object.

11. The method of claim 10 wherein each macroblock 
comprises a 16x16 array of luminance pixels in the form of 
4, 8x8 blocks and the corresponding chrominance pixels.

12. The method of claim 10 wherein said step of inverse 
transforming comprises the step of inverse discrete cosine 
transform (DCT) transforming the frequency domain coef- 
ficients to obtain the luminance and chrominance pixel 
difference values. 

13. The method of claim 10 wherein said step of recov- 
ering the approximate shape of the object comprises the 
following steps: 

decoding the chroma-key value and a threshold from the
bitstream;

comparing each pixel value of the boundary macroblocks
of the reconstructed object to the chroma-key value;

if the pixel value is within a threshold of the chroma-key
value, then the pixel is not included in the recovered
video object;

if the pixel is not within the predetermined threshold of 
the chroma-key value, then the pixel is included in the 
recovered video object. 

14. The method of claim 10 wherein said step of recovering
the approximate shape of the object comprises the
following steps:

decoding the chroma-key value and first and second
thresholds T1 and T2 from the bitstream;

calculating an alpha map based on the pixel luminance
and chrominance pixel values of the reconstructed
object, the chroma-key color and the first and second
thresholds; and

applying the alpha map to the pixel luminance and 
chrominance pixel values to obtain final luminance and 
chrominance values. 

15. The method of claim 14 wherein said step of calcu- 
lating an alpha map comprises the following steps applied 
either to object boundary macroblocks or to object boundary 
as well as inside the object macroblocks: 

A) Calculate an alpha value for a decoded pixel (X) by
first computing the distortion measure:

d = (K_Y - X_Y)^2 + (K_U - X_U)^2 + (K_V - X_V)^2;

wherein K_Y, K_U and K_V represent luminance and
chrominance values for the chroma-key color K, and wherein
X_Y, X_U and X_V represent luminance and chrominance values
for a pixel.

16. The method of claim 14 wherein said step of calculating
an alpha map comprises the following steps applied
either to object boundary macroblocks or to object boundary
as well as inside the object macroblocks:

A) Calculate an alpha value for a decoded pixel (X) by
first computing the distortion measure:

d1 = |K_Y - X_Y| + |K_U - X_U| + |K_V - X_V|;

wherein K_Y, K_U and K_V represent luminance and
chrominance values for the chroma-key color K, and wherein
X_Y, X_U and X_V represent luminance and chrominance values
for a pixel; and

multiply d1 by a scaling factor.

17. The method of claim 15 wherein said step of applying 
comprises the steps of: 

B) calculate the alpha value (α) for each pixel in the said
macroblocks as a function of distance d between the
reconstructed pixel luminance and chrominance values
(YUV) and the chroma-key color K (using K_Y, K_U, K_V
and thresholds T1 and T2)

if (d < T1) then α = 0;
else if (T1 < d < T2) then α = (d - T1)/(T2 - T1) x 255;
else if (d > T2) then α = 255; and

assigning α = 0 to pixels of macroblocks outside the object
and α = 255 to pixels of macroblocks inside the object if
not already assigned a value by above equations; and

C) calculate the final pixel luminance and chrominance
values for the reconstructed object as follows:

pixel value = {α·X + (255 - α)·Z}/255

wherein Z is the corresponding background pixel.

18. The method of claim 10 wherein:

said step of rescaling the transformed coefficients for
macroblocks inside the object using a first quantizer 
comprises the step of multiplying the transformed 
coefficients by the first quantizer; and 

said step of rescaling the transformed coefficients for 
macroblocks on the object boundary using a second 
quantizer comprises the step of multiplying the trans- 
formed coefficients for boundary macroblocks by the 
second quantizer. 

* * * * *